feat(sdk): harden pkg/kit embedder surface with scoped additions#69
Conversation
Implements 11 of the 14 items from the embedder SDK hardening request, all additive (no existing signature changes): - guard config.configPath global with a mutex; add GetConfigPath - extend PromptOptions with per-call model/thinking/tools/provider overrides and add PromptResultWithOptions - capture per-turn stream deltas in TurnResult.Stream ([]StreamEvent) - add ToolOutput.Halt/FinalValue, surfaced as TurnResult.HaltedByTool/ FinalValue for structured-result extraction - add provider-error sentinels + ClassifyProviderError, wired into the turn error path - add NewRawTool schema-driven untyped tool constructor - add CompactionEvent.Err and emit it on the compaction failure path - promote AppendBranchSummary onto SessionManager with ErrBranchSummaryNotSupported; drop the type-assertion fallback - add CloseContext for deadline-bounded shutdown - add ResetForTesting behind the testing build tag - add LoadSkillsFromFS (fs.FS-typed skill loader) Includes unit tests and docs-site coverage for each addition. Refs #68
|
Connected to Huly®: KIT-70 |
|
Warning Review limit reached
More reviews will be available in 9 minutes and 48 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR adds fourteen scoped SDK capabilities: a concurrency-safe config path with ChangesKit SDK Feature Expansion
Sequence Diagram(s)sequenceDiagram
participant Caller
participant Kit as Kit.PromptResultWithOptions
participant runTurn
participant generate
participant streamCollector
participant haltHolder
participant ClassifyProviderError
rect rgba(100, 149, 237, 0.5)
note over Kit,runTurn: Per-call override apply/restore (promptOptsMu)
Caller->>Kit: PromptResultWithOptions(ctx, msg, opts)
Kit->>Kit: applyPromptOptions(opts) — save state, set overrides
end
Kit->>runTurn: run turn
runTurn->>runTurn: attach streamCollector + haltHolder to ctx
runTurn->>generate: generate(ctx, ...)
generate->>streamCollector: add StreamEvent (text/reasoning/tool chunk)
generate->>haltHolder: recordHalt via tool callback
generate-->>runTurn: done or error
alt provider error
runTurn->>ClassifyProviderError: classify(err)
ClassifyProviderError-->>runTurn: sentinel-wrapped error
runTurn-->>Kit: error
else success
runTurn->>streamCollector: drain() → TurnResult.Stream
runTurn->>haltHolder: snapshot() → TurnResult.HaltedByTool, FinalValue
runTurn-->>Kit: TurnResult
end
Kit->>Kit: restorePromptOptions() — restore saved state
Kit-->>Caller: TurnResult, error
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/kit/kit.go (1)
2359-2361:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winEmit
StreamEventReasoningStarton reasoning-start callbacks.
OnReasoningStartcurrently emits onlyReasoningStartEvent;TurnResult.Streammisses the start marker on providers using reasoning callbacks (non-<think>path).Proposed fix
OnReasoningStart: func(id string) { m.events.emit(ReasoningStartEvent{ID: id}) + collector.add(StreamEvent{Kind: StreamEventReasoningStart}) },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/kit/kit.go` around lines 2359 - 2361, The OnReasoningStart callback function currently only emits ReasoningStartEvent but needs to also emit StreamEventReasoningStart to ensure TurnResult.Stream has the proper start marker for providers using reasoning callbacks. In addition to the existing m.events.emit call with ReasoningStartEvent, add another m.events.emit call to emit StreamEventReasoningStart with the same ID parameter to properly mark the start of the stream event on the non-think path.
🧹 Nitpick comments (3)
pkg/kit/errors_test.go (1)
37-41: ⚡ Quick winUse identity check for “unchanged error” assertion.
For the unclassified path, comparing
Error()text can hide wrapping/regression. Checking pointer identity verifies the contract directly.✅ Suggested test tweak
- if tc.want == nil { - // Unclassified errors are returned unchanged. - if got.Error() != tc.in.Error() { - t.Fatalf("expected unchanged error, got %v", got) - } - return - } + if tc.want == nil { + // Unclassified errors are returned unchanged. + if got != tc.in { + t.Fatalf("expected unchanged error instance, got %v", got) + } + return + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/kit/errors_test.go` around lines 37 - 41, The unclassified error assertion in the test is comparing error text representations using the Error() method, which can mask wrapping or modifications to the error object. Replace the comparison of Error() string values with a direct pointer identity check comparing got and tc.in directly, so the test verifies that the exact same error object is returned unchanged rather than just checking that the text representation matches.pkg/kit/rawtool_test.go (1)
58-72: ⚡ Quick winAdd a regression case for whitespace-
nullinput.Given
NewRawToolacceptsnullas “no args”, add a case for" null "to lock consistent behavior and prevent nil-map regressions.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/kit/rawtool_test.go` around lines 58 - 72, Add a regression test case within TestNewRawToolInvalidArgs or create a new test function to verify that NewRawTool handles whitespace-padded null input (such as " null " with surrounding spaces) consistently with plain null input. The test should call tool.Run with this whitespace-null input and verify that the handler is not invoked and the response indicates proper error handling, ensuring nil-map parsing behavior remains consistent across different null input variations.pkg/kit/kit.go (1)
2801-2802: ⚡ Quick winWrap returned errors with operation context.
These returns drop call-site context. Wrap with
fmt.Errorf("...: %w", err)for debuggability and consistency.Proposed fix
- return nil, err + return nil, fmt.Errorf("set model with prompt overrides: %w", err) @@ - return nil, err + return nil, fmt.Errorf("apply prompt options: %w", err)As per coding guidelines,
Always check errors and wrap them with fmt.Errorf("context: %w", err).Also applies to: 2843-2845
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/kit/kit.go` around lines 2801 - 2802, Error returns at lines 2801-2802 and 2843-2845 lack operation context for debugging. Wrap each bare error return statement (where `return nil, err` or similar appears) with `fmt.Errorf` to add descriptive context about what operation failed, using the pattern `fmt.Errorf("operation description: %w", err)`. This applies to all error return statements in the affected code sections to maintain consistency with project coding guidelines.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/skills/skills.go`:
- Around line 184-185: The errs variable declared at line 184 is a slice of
strings ([]string), and at lines 205, 210, and 220, errors are being converted
to strings before appending, which destroys the error chain and breaks
errors.Is/As functionality downstream. Change the errs variable from a slice of
strings to a slice of error types ([]error), preserve the actual error objects
when appending to this slice instead of converting them to strings, ensure
individual errors are wrapped with appropriate context using fmt.Errorf with %w
wrapping, and ensure the final aggregated error properly maintains the error
chain so downstream consumers can use errors.Is and errors.As correctly.
In `@pkg/kit/compaction.go`:
- Around line 119-124: The error handling block in the compactInternal function
does not wrap the error returned from m.compactImpl with context information.
Wrap the error using fmt.Errorf with an appropriate contextual message before
using it in both the CompactionEvent emission and the return statement. Use the
%w verb to preserve the original error while adding call-site context that will
aid debugging for subscribers and callers.
In `@pkg/kit/errors.go`:
- Around line 33-40: The documentation for the ClassifyProviderError function
claims "The original error remains reachable via errors.Unwrap", but this is
incorrect because the standard errors.Unwrap function only works with Unwrap()
error (returning a single error), not Unwrap() []error (returning multiple
errors). Verify the actual implementation of the error type returned by
ClassifyProviderError and ensure it properly implements the Unwrap method
according to the standard library contract. If the function returns an error
that implements Unwrap() []error, either change it to implement Unwrap() error
returning a single unwrapped error, or update the documentation to accurately
reflect what can be retrieved via errors.Unwrap.
In `@pkg/kit/kit.go`:
- Around line 2803-2808: The restore closure in the defer block calls
m.SetModel(ctx, prevModel) using the caller's context, which may be canceled or
expired by the time the defer executes, causing the error to be silently ignored
with an underscore. Replace the caller's context with a background context or a
new context that won't be canceled when calling m.SetModel in the restore
function, ensuring the rollback can complete successfully and prevent per-call
overrides from leaking into subsequent calls.
- Around line 2796-2800: Replace the direct m.v.Set() calls for the three
configuration keys (thinking-level, provider-url, and provider-api-key) with the
restoreViperString helper function to properly respect the prev*Set flags that
were saved at the start of the block. Instead of calling
m.v.Set("thinking-level", prevThinking), m.v.Set("provider-url", prevURL), and
m.v.Set("provider-api-key", prevKey) directly, use restoreViperString for each
key with its corresponding previous value and flag, matching the pattern already
used at lines 2803-2806 for the restore() operation.
In `@pkg/kit/sessions.go`:
- Around line 220-227: The CollapseBranch method in the Kit type accepts a toID
parameter that is not used in the implementation, since AppendBranchSummary
always collapses to the current leaf. Add godoc documentation to the
CollapseBranch function that clarifies the toID parameter is unused and
explicitly documents that the method always collapses branches to the current
leaf regardless of what toID value is passed, to prevent callers from being
confused about the actual behavior.
In `@pkg/kit/tools.go`:
- Around line 5-8: The null check comparing `string(input) != "null"` is brittle
because it doesn't handle whitespace variants like " null ", which can cause
json.Unmarshal to unexpectedly set args to nil. Update the null check logic by
using strings.TrimSpace to remove leading and trailing whitespace from the input
before comparing it to "null". Apply this fix to all occurrences of the brittle
null comparison, including the locations around the import section and the
also-applies-to range mentioned in the comment.
In `@www/pages/sdk/sessions.md`:
- Line 102: The text in the SessionManager documentation section contains an
incorrect reference labeled "SDK skill reference" that links to a repository and
appears outdated for this context. Replace the misleading "SDK skill reference"
link text and URL with an appropriate reference that correctly documents the
SessionManager interface definition, ensuring it accurately reflects the actual
documentation or interface definition location being referenced.
---
Outside diff comments:
In `@pkg/kit/kit.go`:
- Around line 2359-2361: The OnReasoningStart callback function currently only
emits ReasoningStartEvent but needs to also emit StreamEventReasoningStart to
ensure TurnResult.Stream has the proper start marker for providers using
reasoning callbacks. In addition to the existing m.events.emit call with
ReasoningStartEvent, add another m.events.emit call to emit
StreamEventReasoningStart with the same ID parameter to properly mark the start
of the stream event on the non-think path.
---
Nitpick comments:
In `@pkg/kit/errors_test.go`:
- Around line 37-41: The unclassified error assertion in the test is comparing
error text representations using the Error() method, which can mask wrapping or
modifications to the error object. Replace the comparison of Error() string
values with a direct pointer identity check comparing got and tc.in directly, so
the test verifies that the exact same error object is returned unchanged rather
than just checking that the text representation matches.
In `@pkg/kit/kit.go`:
- Around line 2801-2802: Error returns at lines 2801-2802 and 2843-2845 lack
operation context for debugging. Wrap each bare error return statement (where
`return nil, err` or similar appears) with `fmt.Errorf` to add descriptive
context about what operation failed, using the pattern `fmt.Errorf("operation
description: %w", err)`. This applies to all error return statements in the
affected code sections to maintain consistency with project coding guidelines.
In `@pkg/kit/rawtool_test.go`:
- Around line 58-72: Add a regression test case within TestNewRawToolInvalidArgs
or create a new test function to verify that NewRawTool handles
whitespace-padded null input (such as " null " with surrounding spaces)
consistently with plain null input. The test should call tool.Run with this
whitespace-null input and verify that the handler is not invoked and the
response indicates proper error handling, ensuring nil-map parsing behavior
remains consistent across different null input variations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cac61dfc-3076-466f-85e5-9a2f8d3370a6
📒 Files selected for processing (24)
internal/agent/agent.gointernal/config/config.gointernal/config/configpath_test.gointernal/skills/skills.gointernal/skills/skills_fs_test.gopkg/kit/README.mdpkg/kit/adapter.gopkg/kit/compaction.gopkg/kit/errors.gopkg/kit/errors_test.gopkg/kit/events.gopkg/kit/kit.gopkg/kit/rawtool_test.gopkg/kit/session.gopkg/kit/sessions.gopkg/kit/skills.gopkg/kit/testing.gopkg/kit/tools.gopkg/kit/turn_capture.gopkg/kit/turn_capture_test.gowww/pages/advanced/json-output.mdwww/pages/sdk/callbacks.mdwww/pages/sdk/overview.mdwww/pages/sdk/sessions.md
- skills: aggregate load errors as []error with errors.Join + %w so the chain stays inspectable via errors.Is/As (both LoadSkillsFromDir and LoadSkillsFromFS) - errors: correct ClassifyProviderError godoc — sentinelError implements Unwrap() []error, which errors.Unwrap does not traverse; reference errors.Is instead (also fixed in the docs site) - kit: use restoreViperString on the applyPromptOptions error path so previously-unset config keys are cleared, not forced to "" - kit: roll back the per-call model override with context.Background() so a canceled caller ctx can't leak the override into later calls - sessions: document that CollapseBranch's toID is unused and the branch always collapses to the current leaf - docs: retarget the SessionManager link to its pkg.go.dev definition Refs #68
Normalise the raw input with bytes.TrimSpace before the null/empty check so " null ", "\tnull\n", etc. follow the same skip-unmarshal path as the bare "null" and the handler always receives a non-nil empty map. Removes an implicit dependency on fantasy trimming the value during its RawMessage decode. Adds a contract test for the null-variant inputs. Refs #68
Description
Hardens the
pkg/kitembedder surface with a set of scoped, purely additivechanges that unblock building higher-level agent frameworks on top of Kit. No
existing exported signature changes — every item is a new field, new exported
symbol, or an interface method with a default implementation on the shipped
managers, so zero-valued defaults preserve current behavior.
This implements 11 of the 14 items from the embedder SDK hardening request
(#68). The remaining three (#2 pluggable provider injection, #7
LLMUsage.Extra,#13
Kit.Fork) are deferred because they require a breaking change or anagent-core/architecture shift that warrants maintainer sign-off; they are
tracked on the issue.
Example — per-call overrides without rebuilding a
*Kit, plus structured-resultextraction and typed error handling:
Refs #68
Type of Change
What's included
Tier 1 — removes embedder workarounds / feature drops:
config.configPathpackage global with async.RWMutexandadd
GetConfigPath; route all readers through it (fixes a-racedata racewhen concurrent
kit.New()calls discover a.kit.yml).PromptOptionswith per-callModel,ThinkingLevel,ExtraTools,ProviderURL,ProviderAPIKeyoverrides (scoped apply/restore,serialized by a mutex) and add
PromptResultWithOptions.TurnResult.Stream([]StreamEvent)in emit order; clarify that
PromptResultblocks until end-of-turn.Tier 2 — simplify embedder code:
ToolOutput.Halt/FinalValue, surfaced asTurnResult.HaltedByTool/FinalValueso structured-result patterns no longer need a side-channel.errors.gowith provider-error sentinels (ErrContextOverflow,ErrRateLimit,ErrAuth,ErrProviderUnavailable,ErrInvalidRequest) plusClassifyProviderError, wired into the turn error path.NewRawTool— schema-driven untyped tool constructor for tools sourcedfrom JSON Schema (skill files, MCP catalogs).
CompactionEvent.Errand emit the event on the failure path too.AppendBranchSummaryonto theSessionManagerinterface withErrBranchSummaryNotSupported; drop thetreeManagerAdaptertype-assertionfallback in
CollapseBranch.Tier 3 — papercuts:
CloseContext(ctx)for deadline-bounded shutdown (Close()delegates).ResetForTesting()behind thetestingbuild tag.LoadSkillsFromFS(fsys fs.FS, root string)—fs.FS-typed skill loadersharing the parser with the path-based loaders.
Checklist
gofmt,go vet,golangci-lint)go test -race ./...)Additional Information
PromptOptions/ToolOutput/CompactionEventgain new fields (zero value = prior behavior);Close()isunchanged and now delegates to
CloseContext.SessionManagergainsAppendBranchSummary. The shippedtreeManagerAdapterimplements it; custom managers should returnErrBranchSummaryNotSupportedif unsupported.internal/config/configpath_test.go(race),internal/skills/skills_fs_test.go,pkg/kit/errors_test.go,pkg/kit/rawtool_test.go,pkg/kit/turn_capture_test.go.pkg/kit/README.mdand the docs site (www/pages/sdk/*,www/pages/advanced/json-output.md);npx tome buildpasses.LLMUsage.Extra(breaking — it's a type alias for the external
fantasy.Usage), Ctrl+C exits the app instead of clearing the current input #13Kit.Fork.Summary by CodeRabbit
Release Notes
FinalValue,HaltedByTool).NewRawToolwith JSON argument validation.PromptOptionsandPromptResultWithOptions.LoadSkillsFromFSfor embedded skills discovery.AppendBranchSummary/CollapseBranch) with a dedicated “not supported” error sentinel.CloseContext).